# FP8 Quantized Inference

## Qwen3 32B FP8
License: Apache-2.0
Qwen3-32B-FP8 is the latest 32.8B-parameter large language model in the Qwen series, supporting switching between thinking and non-thinking modes, with exceptional reasoning, instruction-following, and agent capabilities.
Tags: Large Language Model, Transformers
Publisher: Qwen · Downloads: 29.26k · Likes: 47

## Qwen3 8B FP8
License: Apache-2.0
Qwen3-8B-FP8 is the latest version in the Qwen series of large language models, offering FP8 quantization, seamless switching between thinking and non-thinking modes, and powerful reasoning capabilities with multilingual support.
Tags: Large Language Model, Transformers
Publisher: Qwen · Downloads: 22.18k · Likes: 27
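
Both Qwen3 FP8 checkpoints above expose the same thinking/non-thinking switch through their chat template. The sketch below shows one way to load such a checkpoint with Hugging Face Transformers and toggle the mode via `enable_thinking`, following the usage documented on the Qwen3 model cards; whether the FP8 weights load directly in Transformers depends on your GPU and library versions, and the cards also describe serving with vLLM or SGLang.

```python
# Minimal sketch, assuming the Qwen/Qwen3-8B-FP8 checkpoint and a recent
# transformers release with FP8 support on the target GPU.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-8B-FP8"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",   # honor the quantization config shipped with the checkpoint
    device_map="auto",
)

messages = [{"role": "user", "content": "Explain FP8 quantized inference in one paragraph."}]

# enable_thinking=True makes the model emit an explicit reasoning block before
# the final answer; set it to False for a direct, non-thinking response.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)

inputs = tokenizer([text], return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```
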
## Qwen2.5 VL 72B Instruct FP8 Dynamic
License: Apache-2.0
FP8 quantized version of Qwen2.5-VL-72B-Instruct, supporting vision-text input and text output, optimized and released by Neural Magic.
Tags: Image-to-Text, Transformers, English
Publisher: parasail-ai · Downloads: 78 · Likes: 1
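
FP8-Dynamic checkpoints in the Neural Magic style are typically served with vLLM, which applies dynamic per-token activation scaling at runtime. The sketch below is illustrative only: the repository id, image URL, and parallelism settings are assumptions, and a 72B model generally needs several GPUs even at FP8.

```python
# Hedged sketch of vision-text inference with vLLM; repo id and settings are assumed.
from vllm import LLM, SamplingParams

llm = LLM(
    model="parasail-ai/Qwen2.5-VL-72B-Instruct-FP8-Dynamic",  # assumed repo id, check the model card
    max_model_len=8192,
    tensor_parallel_size=4,  # 72B weights at FP8 still exceed a single GPU's memory
)

sampling = SamplingParams(temperature=0.2, max_tokens=256)

# vLLM's chat API accepts OpenAI-style multimodal messages for vision-language models.
messages = [{
    "role": "user",
    "content": [
        {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}},  # placeholder image
        {"type": "text", "text": "Describe what this image shows."},
    ],
}]

outputs = llm.chat(messages, sampling_params=sampling)
print(outputs[0].outputs[0].text)
```
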
## Llama 3.1 8B Instruct FP8
FP8 quantized version of the Meta Llama 3.1 8B Instruct model, an autoregressive language model with an optimized transformer architecture and support for a 128K-token context length.
Tags: Large Language Model, Transformers
Publisher: nvidia · Downloads: 3,700 · Likes: 21
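
FP8 checkpoints like this one target FP8-capable GPUs (Hopper/Ada class and newer). A minimal offline-inference sketch with vLLM follows; the repository id and context-length setting are assumptions, and the checkpoint's card may instead recommend TensorRT-LLM as the serving runtime.

```python
# Illustrative sketch, assuming an FP8-capable GPU and the repo id below.
from vllm import LLM, SamplingParams

llm = LLM(
    model="nvidia/Llama-3.1-8B-Instruct-FP8",  # assumed repo id, verify on the model card
    max_model_len=32768,  # the base model advertises up to 128K context, KV-cache memory permitting
)

messages = [{"role": "user", "content": "Summarize the main trade-offs of FP8 quantized inference."}]
outputs = llm.chat(messages, sampling_params=SamplingParams(temperature=0.0, max_tokens=200))
print(outputs[0].outputs[0].text)
```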